Distributed System — Appunti TiTilda

Indice

Introduction

A distributed system is a collection of independent computer that appears to the users as a single computer.

Some key aspects of distributed systems are:

Distributed System Architecture

High-Level Architecture

A distributed system is composed of multiple machines connected through a network. Those machines can be:

Runtime Architecture

A distributed system can be classified based on the components, the connections, the data exchange, and the interaction between them.

Client-Server

This is the most common architecture.

It is based on two component:

In distributed system one server can use a service from other servers creating a multi-tier (N-tier) infrastructure.

The services that can be distributed are:

This architecture is easy to manage and scale, but the server is a single point of failure and a bottleneck for the system.

Service Oriented

The service oriented architecture is based on three components:

The interface is described with a standard language (e.g. WSDL).

The communication is done through a standard protocol (e.g. SOAP).

REST

REpresentional State Transfer (REST) is an architectural style that define how web standards should be used.

The interaction happen between a client and a server. The interaction is stateless, meaning that each request from the client to the server must contain all the information needed to understand and process the request.

The response must be explicitly labeled as cacheable or non-cacheable.

The server interface must satisfy four constraints:

  1. Identification of resources: Each resource is identified by a unique URI;
  2. Manipulation of resources through representations: The communication is done through representations of the resource (e.g. JSON, XML, HTML), selected dynamically based on the client needs;
  3. Self-descriptive messages: Each message contains metadata about how to process it (e.g. HTTP headers);
  4. Hypermedia as the engine of application state: The server provides links to other resources dynamically, allowing clients to navigate the application state.

Peer-to-Peer

All the components in a peer-to-peer architecture are equal and can act as both client and server.

A server, in a client-server architecture, represents a single point of failure and a bottleneck for the system. In a peer-to-peer architecture, the failure of a single node does not affect the overall system, as other nodes can take over its responsibilities.

Object-Oriented

An object-oriented architecture is based on the concept of objects that encapsulate data and behavior. The objects provide an interface to interact with them.

Data-Centred

A data-centered architecture is based on a shared data space (Repository - tuple space or shared global state) that is accessible by all the components in the system.

Data can be added or read from the repository.

When performing a read operation, the component can either take from the repository (destructive read) or read the data without removing it (non-destructive read).

The components provide a pattern (or template) to access the data. If there is no match the component will wait until there is one.

Event-Based

An event-based architecture is based on the concept of events that are generated by components in the system.

Other components can subscribe to events and be notified when an event occurs.

Mobile Code

The Mobile Code architectural style allows for the dynamic movement of code or an entire running process across the network at runtime. This capability enhances a system’s flexibility, enabling it to receive new functionality without being halted or manually updated.

This style is often implemented in languages that run on a virtual machine (VM), such as Java or JavaScript, because the VM provides a crucial layer of abstraction and control over the executing environment.

The ability to receive and execute code from an external source is a major security risk.

Code on Demand

The client retrieves the executable code (or script) from a server and executes it locally.

Only the code itself moves (Weak Mobility).

An example is a web browser that downloads and executes JavaScript code from a web server.

Remote Evaluation

The client sends a request that is executed as code on the server.

The server performs the computation and returns only the result to the client (Weak Mobility).

An example is an SQL query that is sent to a database server for execution, and the results are returned to the client, or a cloud notebook like Google Colab where the code is executed on a remote server.

Mobile Agent

A process, including both its code and its complete execution state (data, program counter, stack), is suspended, migrated to a new machine, and then resumed (Strong Mobility).

Interactions

The behavior of distributed system is determined by a distributed algorithm: an algorithm executed collaboratively across multiple, independent machines. Due to the distribution, the algorithm must handle communication, synchronization, and fault tolerance.

A critical component of the performance and behavior of a distributed algorithm is the time: speed of the processes, the performance of the communication channel, and the clock drift (the time difference between the different machines) rates between machines.

There are two types of distributed systems based on time:

Pepperland Example

There are two generals on top of two hills. They need to sync who will lead an assault and when to start it. They can communicate on a reliable channel.

To choose the leader they could both choose a random number, the bigger on win.

In a async system is impossible to choose a time to charge as there is no bound on the time. One general could send a message to the other, but there is no guarantee that the message will arrive in a specific time. The other general could wait for the message, but there is no guarantee that it will arrive. The only solution is to charge immediately, but this could lead to a failure if the other general doesn’t charge at the same time.

Failures

In a distributed system, the key advantage is that failures are partial (one component fails, not the whole system), and the goal is to mask these failures from the client, achieving fault tolerance.

There are different types of failures:

Failure Detection

To detect a failure, a process can send a heartbeat message to another process. If the message is not received within a certain time, the process is considered failed.

In an asynchronous system, it’s impossible to distinguish between a slow process and a failed process. To mitigate this, a timeout can be used, but it can lead to false positives.

The same happens for unreliable channels, where a message can be lost or delayed, making hard to distinguish between a failed process and a lost message.

Ultima modifica:
Scritto da: Andrea Lunghi